Effective Analysis, Characterization, and Detection of Malicious Activities on the Web
نویسنده
چکیده
The Web has evolved from a handful of static web pages to billions of dynamic and interactive web pages. This evolution has positively transformed the paradigm of communication, trading, and collaboration for the benefit of humanity. However, these invaluable benefits of the Web are shadowed by cyber-criminals who use the Web as a medium to perform malicious activities motivated by illegitimate benefits. Cyber-criminals often lure victims to visit malicious web pages, exploit vulnerabilities on victims’ devices, and then launch attacks that could lead to: stealing invaluable credentials of victims, downloading and installation of malware on victims’ devices, or complete compromise of victims’ devices to mount future attacks. While the current state-of-the-art is to detect malicious web pages is promising, it is yet limited in addressing the following three problems. First, for the sake of focused detection of certain class of malicious web pages, existing techniques are limited to partial analysis and characterization of attack payloads. Secondly, attacker-motivated and benign evolution of web page artifacts have challenged the resilience of existing detection techniques. The third problem is the prevalence and evolution of Exploit Kits used in spreading web-borne malware. In this dissertation, we present the approaches and the tools we developed to address these problems. To address the partial analysis and characterization of attack payloads, we propose a holistic and lightweight approach that combines static analysis and minimalistic emulation to analyze and detect malicious web pages. This approach leverages features from URL structure, HTML content, JavaScript executed on the client, and reputation of URLs on social networking websites to train multiple models, which are then used in confidence-weighted majority vote classifier to detect unknown web pages. Evaluation of the approach on a large corpus of web pages shows that the approach not only is precise enough in detecting malicious web pages with very low false signals but also does detection with a minimal performance penalty. To address the evolution of web page artifacts, we propose an evolutionaware approach that tunes detection models inline with the evolution of web page artifacts. Our approach takes advantage of evolutionary searching and optimization using Genetic Algorithm to decide the best combination of features and learning algorithms, i.e., models, as a function of detection accuracy and false signals. Evaluation of our approach suggests that it reduces false negatives by about 10% on a fairly large testing corpus of web pages. To tackle the prevalence of Exploit Kits on the Web, we first analyze source code and runtime behavior of several Exploit Kits in a contained setting. In addition, we analyze the behavior of live Exploit Kits on the Web in a contained environment. Combining the analysis results, we characterize Exploit Kits pertinent to their attack-centric and self-defense behaviors. Based on these behaviors, we draw distinguishing features to train classifiers used to detect URLs that are hosted by Exploit Kits. The evaluation of our classifiers on independent testing dataset shows that our approach is effective in precisely detecting malicious URLs linked with Exploit Kits with very low false positives.
منابع مشابه
Analyzing new features of infected web content in detection of malicious web pages
Recent improvements in web standards and technologies enable the attackers to hide and obfuscate infectious codes with new methods and thus escaping the security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...
متن کاملFeature-based Malicious URL and Attack Type Detection Using Multi-class Classification
Nowadays, malicious URLs are the common threat to the businesses, social networks, net-banking etc. Existing approaches have focused on binary detection i.e. either the URL is malicious or benign. Very few literature is found which focused on the detection of malicious URLs and their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...
متن کاملAnomaly-based Web Attack Detection: The Application of Deep Neural Network Seq2Seq With Attention Mechanism
Today, the use of the Internet and Internet sites has been an integrated part of the people’s lives, and most activities and important data are in the Internet websites. Thus, attempts to intrude into these websites have grown exponentially. Intrusion detection systems (IDS) of web attacks are an approach to protect users. But, these systems are suffering from such drawbacks as low accuracy in ...
متن کاملDetecting Bot Networks Based On HTTP And TLS Traffic Analysis
Abstract— Bot networks are a serious threat to cyber security, whose destructive behavior affects network performance directly. Detecting of infected HTTP communications is a big challenge because infected HTTP connections are clearly merged with other types of HTTP traffic. Cybercriminals prefer to use the web as a communication environment to launch application layer attacks and secretly enga...
متن کاملBeeID: intrusion detection in AODV-based MANETs using artificial Bee colony and negative selection algorithms
Mobile ad hoc networks (MANETs) are multi-hop wireless networks of mobile nodes constructed dynamically without the use of any fixed network infrastructure. Due to inherent characteristics of these networks, malicious nodes can easily disrupt the routing process. A traditional approach to detect such malicious network activities is to build a profile of the normal network traffic, and then iden...
متن کامل